docs: verify reduction #841 — NAE 3-SAT → Set Splitting by isPANN · Pull Request #983 · CodingThrust/problem-reductions

isPANN · 2026-04-01T07:59:39Z

Summary

Formal Typst proof of NAE 3-SAT → Set Splitting reduction (Lovász 1973 / Garey & Johnson SP4)
Constructor verification script: 41,107 checks, 0 failures (7 mandatory sections, exhaustive n=3, sampled n=4,5)
Adversary verification script: 19,832 checks, 0 failures (independent implementation, 3 hypothesis property-based strategies)
Cross-comparison: 900 instances, 0 disagreements, 0 feasibility mismatches

Verification Report

=== Verification Report: NAE 3-SAT → Set Splitting (#841) ===

Typst proof: docs/paper/proposed-reductions-841.typ
  - Construction: ✓ (3 steps: universe, complementarity, clause subsets)
  - Correctness: ✓ (⟹ + ⟸)
  - Extraction: ✓
  - Overhead: ✓ (universe_size = 2n, num_subsets = n + m)
  - YES example: ✓ (4 vars, 3 clauses, σ=(1,0,1,0))
  - NO example: ✓ (3 vars, 8 clauses = all sign patterns, every σ has all-true clause)

Constructor: verify_nae3sat_setsplitting.py
  - Checks: 41,107 passed, 0 failed
  - Sections: 1(sympy) 2(exhaustive) 3(extraction) 4(overhead) 5(structural) 6(YES) 7(NO)
  - Forward: exhaustive n=3, sampled n=4,5
  - Backward: exhaustive n=3, sampled n=4,5
  - Gap analysis: all claims covered

Adversary: adversary_nae3sat_setsplitting.py
  - Checks: 19,832 passed, 0 failed
  - Property-based: 3 hypothesis strategies (2,500 examples)
  - Forward: exhaustive n ≤ 5
  - Backward: exhaustive n ≤ 5
  - Bugs found: none

Cross-comparison:
  - Instances compared: 900
  - Structural agreement: 900/900
  - Feasibility agreement: 900/900

Bugs found: none
Iterations: 2 rounds (fixed NO example scratch work, fixed clause-matching index)

Verdict: VERIFIED

Test plan

Typst compiles without errors
Constructor script: python3 docs/paper/verify-reductions/verify_nae3sat_setsplitting.py → 0 failures
Adversary script: python3 docs/paper/verify-reductions/adversary_nae3sat_setsplitting.py → 0 failures
Cross-comparison: python3 docs/paper/verify-reductions/cross_compare_nae3sat_setsplitting.py → 0 disagreements

Closes #841

🤖 Generated with Claude Code

Covers 9 reductions: 2 NP-hardness chain extensions (#973, #198), 4 Tier 1a blocked issues (#379, #380, #888, #822), and 3 Tier 1b blocked issues (#892, #894, #890). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…uctions New skill: /verify-reduction <issue-number> End-to-end pipeline that takes a reduction rule issue and produces: 1. Typst proof (Construction/Correctness/Extraction/Overhead + YES/NO examples) 2. Python verification script (7 mandatory sections, ≥5000 checks, exhaustive n≤5) 3. Lean 4 lemmas (non-trivial structural proofs required) Follows issue-to-pr conventions: creates worktree, works in isolation, submits PR. Strict quality gates (zero tolerance): - No "trivial" category — every reduction ≥5000 checks - 7 mandatory Python sections including NO (infeasible) example - Non-trivial Lean required (rfl/omega tautologies rejected) - Zero hand-waving in Typst ("clearly", "obviously" → rejected) - Mandatory gap analysis: every proof claim must have a test - Self-review checklist with 20+ items across 4 categories Developed and validated through PR #975 (800K+ checks, 3 bugs caught) and tested on issues #868 (caught wrong example) and #841 (35K checks). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…kill - Added frontmatter (name, description) matching other skills' convention - Toned down aggressive language ("ZERO TOLERANCE", "THE HARSHEST STEP", "NON-NEGOTIABLE") to professional but firm language - All quality gates unchanged — same strictness, better presentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Replaces Lean-required gates with adversarial second agent: - Step 5: Adversary agent independently implements reduce() and extract_solution() from theorem statement only (not constructor's script) - Step 5c: Cross-comparison of both implementations on 1000 instances - Lean downgraded from required to optional - hypothesis property-based testing for n up to 50 - Quality gates: 2 independent scripts ≥5000 checks each + cross-comparison Design rationale (docs/superpowers/specs/2026-04-01-adversarial-verification-design.md): - Same agent writing proof + test is the #1 risk for AI verification - Two independent implementations agreeing > one + trivial Lean - Lean caught 0 bugs in PR #975; Python caught all 4 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Typst: Construction + Correctness (⟹ + ⟸) + Extraction + Overhead + YES/NO examples Constructor: 41,107 checks, 0 failures (exhaustive n=3, sampled n=4,5, 7 sections) Adversary: 19,832 checks, 0 failures (independent impl + 3 hypothesis strategies) Cross-comparison: 900 instances, all agree, 0 feasibility mismatches Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

codecov · 2026-04-01T08:04:24Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.03%. Comparing base (423506c) to head (e489546).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #983   +/-   ##
=======================================
  Coverage   98.03%   98.03%           
=======================================
  Files         784      784           
  Lines       82310    82310           
=======================================
  Hits        80695    80695           
  Misses       1615     1615

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

zazabap and others added 5 commits March 31, 2026 16:13

isPANN mentioned this pull request Apr 1, 2026

[Rule] NOT-ALL-EQUAL 3SAT to SET SPLITTING #841

Open

zazabap closed this Apr 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: verify reduction #841 — NAE 3-SAT → Set Splitting#983

docs: verify reduction #841 — NAE 3-SAT → Set Splitting#983
isPANN wants to merge 5 commits intomainfrom
verify/nae3sat-setsplitting-841

isPANN commented Apr 1, 2026

Uh oh!

codecov bot commented Apr 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

isPANN commented Apr 1, 2026

Summary

Verification Report

Test plan

Uh oh!

codecov bot commented Apr 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov bot commented Apr 1, 2026 •

edited

Loading